Overlapping fragments method for electronic structure calculation of large systems 
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We present a method for the calculation of electronic structure of systems that contain tens of 
thousands of atoms. The method is based on the division of the system into mutually overlapping 
fragments and the representation of the single-particle Hamiltonian in the basis of eigenstates of 
these fragments. In practice, for the range of system size that we studied (up to tens of thousands 
of atoms) , the dominant part of the calculation scales linearly with the size of the system when all 
the states within a fixed energy interval are required. The method is highly suitable for making 
good use of parallel computing architectures. We illustrate the method by applying it to diagonalize 
the single-particle Hamiltonian obtained using the density functional theory based charge patching 
method in the case of amorphous alkane and polythiophene polymers. 



00 



C/2 



X3 
O 

> 



X 



I. INTRODUCTION 

In the last few decades, density functional theory 
(DFT)i became a method of choice for the calculation 
of the electronic structure of physical systems with a rel- 
atively large number (hundreds to about a thousand) of 
atoms. Within DFT, one has to self-consistently solve 
the Kohn-Sham equations^ for the wave functions ipi and 
energies 
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where Vio-a is the potential of the core ions, Vh is the elec- 
trostatic (Hartree) potential of the electronic charge den- 
sity distribution p{r) and V^c is the exchange correlation 
potential which, under the local density approximation 
(LDA), depends only on charge density at a given point 
in space. 

There is a strong interest to develop methods where the 
cost of solving the system of equations ([T]) would depend 
linearly on the number of atoms in the system. Such meth- 
ods are based either on the representation of DFT equa- 
tions in localized orbital basis sets'^''* or on the division 
of the system into small fragments,5.ifi These methods are 
still computationally demanding due to necessity of eval- 
uating all the wave functions of occupied states in each 
iteration until the self-consistency is reached. 

A different class of (empirical) methods has been devel- 
oped in the semiconductor physics community, where the 
main philosophy is to directly construct the Kohn-Sham 
Hamiltonian [the left hand side of Eq. ([T|)]. In the empiri- 
cal and semiempirical pseudopotential method (EPM and 
SEPM), the total potential is considered as a sum of pseu- 
dopotentials of individual atoms, that are obtained either 
by fitting to the bandstructure of a bulk semiconductor—!^ 
or extracted from ab-initio calculations of the bulk,- Such 
pseudopotentials are then used to construct the Hamilto- 
nian of the nanostructure of interest i^ii^ A more recent ap- 
proach is the charge patching method (CPM),U'l^ where 
the electronic charge density is constructed from charge 
density contributions of individual atoms - so called mo- 
tifs. The motifs are extracted from calculations on small 



prototype systems, where the atoms have a similar bond- 
ing environment as in the system of interest. For a range 
of inorganic and organic semiconducting systems fii"— the 
charge density and the potential obtained from the CPM 
closely match the ones that would be obtained from a 
full self-consistent DFT calculation. The construction of 
the Hamiltonian in the methods mentioned above (EPM, 
SEPM and CPM) is quick and its cost scales linearly with 
the size of the system. 

Once the Kohn-Sham Hamiltonian is constructed, one 
has to solve its eigenvalue problem. For semiconducting 
systems, the spectral region of interest is the one in the 
vicinity of the band gap. Therefore, one needs to solve 
for these electronic states only. This can be achieved us- 
ing the folded spectrum method.— The folded spectrum 
method (implemented in plane wave representation of the 
wave functions) scales linearly with the size of the system 
when a fixed number of states is required. However, in 
many calculations, one is interested in a fixed energy win- 
dow of the order of several k-^T below or above the band 
gap, since this is the spectral region that determines the 
electronic transport properties of the system. The number 
of states in such an energy window also increases linearly 
with the size of the system and consequently the overall 
computational cost within the folded spectrum method 
increases quadratically with the system size. 

In this paper, we present a different strategy for the 
diagonalization of the Hamiltonian. It is based on the 
idea of representing the Hamiltonian in a localized and 
physically well motivated basis. The whole system is di- 
vided into many small fragments, that are not necessarily 
disjoint, and the eigenstates of the fragments are chosen 
as the basis for the representation of the Hamiltonian. In 
some sense, this approach combines the ideas from the lit- 
erature on using the localized basis sets and the division 
of the system into fragments. We will refer to this method 
as overlapping fragments method (OEM). 

We have developed this methodology with a particular 
focus towards its application to understanding the elec- 
tronic states in semiconducting polymer materials. These 
materials are to a large extent disordered and there is a 
strong need for large supercell calculations that would pro- 
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vide reliable information about the degree of localization 
of electronic states, the density of states and eventually 
the electronic transport properties j^^^i^ For such sys- 
tems, the atomic structure can be reliably generated from 
classical molecular dynamics (MD)fiii2i"— but the chal- 
lenge remains to calculate the electronic structure. The 
current method excellently complements our recently de- 
veloped CPM for the construction of the Hamiltonian of 
organic semiconducting materials. 

We present the details of the implementation of the 
methodology in Sec. [Til In Sec. IIII) we illustrate the 
method to diagonalize the Hamiltonian obtained from 
CPM in the case of alkanes and describe the main points 
that one should address when performing such a calcu- 
lation. Finally, in Sec. IIV[ the method is illustrated by 
an application to one of the most widely studied organic 
polymers - poly(3-hexylthiophene) (P3HT). 



II. METHOD AND IMPLEMENTATION 

In this section we describe the details of the OFM and 
its implementation on parallel computers. The input to 
our computation is the atomic structure of the system and 
its potential obtained from CPM, while the output gives 
the transfer integrals Hij^mn = {(l'i'^\H\(l)m'') and the wave 
function overlaps Sij^mn = ('/'P'' I'/'m'') between the pairs 
of states f^l''^ and 4>m^ , which are the z-th wave function of 
the fragment j and the m-th wave function of the fragment 
n. Each fragment consists of a molecule (its choice will be 
discussed later) embedded in a cuboid box. The potential 
is stored on all the tit CPUs available for computation, 
as schematically illustrated in Fig. [T] 

The computation consists of two main parts (Fig. [1]): 
the calculation of basis wave functions and the calcula- 
tion of Hij,mn and Sij^mn- Wc allocate til CPUs to each 
of the fragments, where riL is typically some small number 
(for example tt-l = 8 or 16). The calculation of the basis 
wave functions stemming from a given fragment is per- 
formed as follows. The charge density of the fragment is 
obtained using the charge patching method by adding the 
charge density motifs of each of the atoms in the fragment. 
The Hartree potential of the fragment is then evaluated 
from the solution of the Poisson equation with periodic 
boundary conditions, and the exchange correlation poten- 
tial is obtained from the LDA formula. In such a way, one 
obtains the Hamiltonian of the fragment, which is then 
diagonalized using the ESCAN code,^^ which implements 
the preconditioned conjugated gradient minimization al- 
gorithm with the plane wave basis set. The basis wave 
functions 4>[ stemming from each fragment j are there- 
fore obtained. At this stage, we also calculate Hlc/)^^) 
(where H is the Kohn-Sham Hamiltonian of the whole 
system, not just the fragment), which will be later re- 
quired for the evaluation of -ffy,mn- To achieve this, one 
first has to send the required real space grid values of the 
potential to the CPUs allocated for fragment j. The 



H\(f)f ) operation is then performed using one of the main 
subroutines from the ESCAN code. One should note that 
the speed of the calculation of basis wavefunctions can be 
further improved by using some localized basis set instead 
of plane waves. 

Let n-p be the number of fragments in the system and 
k — [riT / '^l] • We divide the fragments into ngp = \n-p / 
series, as illustrated in Fig. [T] The calculations on the 
fragments from the same series are performed in paral- 
lel, where each fragment uses its tzl allocated CPUs (see 
Fig. [T]). Such calculations are repeated for all nsF series 
of fragments. 

The code for performing the above tasks was written by 
making good use of the existing codes for performing the 
charge patching calculations, solving the Poisson equation 
and the ESCAN code. These were integrated into a single 
code to avoid reading and writing to disk of the input and 
output files, such as charge densities and potentials, which 
can be quite large. For the storage of the calculated basis 
wave functions (jr^, their reciprocal space representation 
is used. Each fragment will have only a few basis func- 
tions, as discussed below, and therefore the required mem- 
ory for their storage is not very big. Each of the existing 
codes has been already parallelized using Message Pass- 
ing Interface (MPI). Further parallelization with respect 
to fragments, described above, was achieved by using the 
mpi_split command and changing the mpi_comm_world 
communicator in the existing codes to a local communi- 
cator (among the til CPUs) defined by the mpi_split 
command. 

The main part of the calculation consists of the calcula- 
tion of the transfer integrals Hij^mn and the wave function 

overlaps Sij^mn between the pairs of states (/)|"'^ and (j)m^ . 

(i) 

Since the wave function (f>Y is well localized to the frag- 
ment J, one naturally introduces an approximation to con- 
sider only the transfer integrals and wave function over- 

laps for the states ipf and 0™ , such that the fragments j 
and n are not too distant in space. The exact criterion for 
this will be formulated later in the paper. Let np be the 
number of pairs of fragments {j, n} for which Hij^mn and 
Sij^mn need to be evaluated. For each pair, we allocate riL 
CPUs where the calculation is performed. The pairs are 
divided into ngp — [np/fc] series, in a similar manner as 
fragments (Fig. [Ij . To perform the calculation of Hij^mn 
and Sij^mn for pair {j, n} on its allocated CPUs, one 

(i) (1) (n) 

needs to receive the wave functions (/>[ , H(p~ , cjim and 

H(j)m\ which are stored on different groups of riL CPUs, 
the ones associated with fragments j and n. With the 
wave functions available on the allocated group of CPUs, 
the overlap element Sij^mn is straightforwardly calculated 

from the overlap of (j)f^ and 0m'', while Hij^mn is calcu- 
lated as the overlap of either (j)^^'^ and HcjJ'm or 
and (jy'm ■ In a similar manner as for fragments, the cal- 
culations for pairs from the same series are performed in 
parallel (Fig. [1]), and then sequentially repeated for all 
nsp series of pairs. 
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FIG. 1. The scheme describing the implementation of the OFM on parallel computers. 



With Sij^mn and Hij „in elements at hand, the final step 
is to find the electronic states by solving the generalized 
eigenvalue problem 

^ ^ {-H^ij .mn E Sij ^rnn^ C^rin ^ 0. (2) 
ran 

As will be shovifn, a limited number of basis wave functions 
is sufficient for rather accurate results. Therefore, the 
dimension of the matrices Sij^mn and Hij^mn is not very 
large. Consequently, in our current implementation of the 
methodology, this part is performed as a postprocessing 
step by using the standard LAPACK— single processor 
routines. For very large systems or basis sets, we use 
ScaLAPACKfSi, One can in principle also exploit the fact 
that the matrices iJ^^mn and Sij^mn are sparse and use 
FARFACK^^ which is well suited in that case. 

We note that a method exploiting to some extent sim- 
ilar ideas has been recently proposed by McMahon and 
Troisi,^** in the context of the calculation of electronic 
structure of semiconducting polymers. Their method is 
also based on the partitioning of the system into fragments 
and calculating the transfer integrals and basis wave func- 
tion overlaps. In their method, the transfer integrals are 
however evaluated from the calculation of the system of 
two fragments in vacuum, which is inevitably an approxi- 
mation. In our case, on the other hand, the transfer inte- 
grals are evaluated from the Kohn-Sham Hamiltonian of 



the whole system. Such transfer integrals therefore fully 
include all the other environmental factors surrounding 
the two fragments. In the case of disordered polymers 
we therefore do include the variations of both on-site en- 
ergies and hopping integrals due to random electrostatic 
potential. 

III. EXAMPLE OF THE CALCULATION: 
DISORDERED ALKANES 

In this section, we would like to illustrate the method- 
ology by applying it to the alkane polymer system. As a 
test system, we choose 20 alkane chains, each one being 
20 monomers long (1240 atoms altogether). We generate 
the atomic structure of the disordered chain from classical 
MD using a simulated annealing procedure. The CFF91 
force field,^^'^ as implemented in the LAMMFS code^^i^ 
is used in the simulation. 



A. Choice of fragments 

The main task necessary to successfully apply the de- 
scribed methodology is to find the best way for the di- 
vision of the system into fragments. A seemingly nat- 
ural way is to cut the polymer into monomers, and to 
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passivate the broken bond in each monomer by the hy- 
drogen atom. In the case of alkanes, this would lead to 
the division of each n-units long alkane chain into n CH4 
molecules. There is however a concern whether the ba- 
sis set formed from the eigenstates of monomer fragments 
would be sufficient to reliably describe the wave function 
of the whole system, in particular in the region of the bro- 
ken bond. A way around this problem is to choose a basis 
set formed from overlapping dimer fragments, illustrated 
in Fig. [5^. In such a way n— units long alkane chain is 
divided into {n — 1) C2llg molecules. The main advantage 
of this way of the division of the system into fragments 
is that each bond in the system is fully encompassed by 
at least one fragment. We expect further improvement in 
the results when the system is divided into trimer frag- 
ments (C3H8 molecules). On the other hand, one should 
keep in mind that the choice of the fragments that are too 
large is not advantageous from the computational point 
of view, as the wave functions of all fragments need to be 
evaluated. 




FIG. 2. (Color online) (a) A disordered chain of C20H42 and its 
division into overlapping dimer fragments. Several first frag- 
ments are shown only, (b) An amorphous system consisting 
of 20 C20H42 chains. Hydrogen atoms have been removed for 
clarity. 

To test the accuracy of the methodology, we have also 
solved the eigenvalue problem of the whole Hamiltonian 
in the plane wave basis set with kinetic energy cutoff of 
60 Ry, using the ESCAN code. In Fig. [Sj we compare 
the eigenenergies of all occupied states obtained with the 



plane wave basis set i?pw and the eigenenergies obtained 
with the basis of fragment wave functions Ep^. We in- 
clude in the basis set all occupied states of the fragment, 
which constitutes four, seven and ten states for the cases 
of monomer, dimer and trimer fragment, respectively. As 
one might have expected, the results from using the basis 
of monomer fragments are not so accurate. In the low- 
est part of the spectrum, they are shifted by more than 
200 meV from the results obtained in plane wave basis, 
while in the part of the spectrum near the top of the va- 
lence band the errors are of the order of 1 eV. The basis of 
dimer fragments is already quite satisfactory with eigen- 
value errors in the 10 meV range in the lowest part of the 
spectrum, and in the 30 meV range near the top of the 
valence band. The basis of trimer fragments gives excel- 
lent results with errors less than 1 meV in the lowest part 
of the spectrum and errors less than 10 meV near the top 
of the valence band. 
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FIG. 3. (Color online) The comparison of eigenenergies of 
amorphous alkane system calculated in the basis of fragment 
wave functions iJpR and plane waves Epw- The basis set con- 
sisting of all occupied states of the fragments was used in the 
calculation. All occupied states are shown in the figure. The 
Fermi level is at around 5 eV. 



Based on the results presented so far, we can conclude 
that the eigenfunctions of trimer fragments are an excel- 
lent basis for the representation of the Hamiltonian of the 
system. Since all the occupied states are calculated ac- 
curately, one can also imagine of using this approach for 
a full self-consistent DFT calculation without the use of 
the CPM. However, for the present purpose, there is a 
strong interest to reduce the basis set as much as possi- 
ble, since the reduction of the number of wave functions 
per fragment by a factor of K, reduces the time for their 
calculation by a factor of if, reduces the number of wave 
function overlaps and overlap integrals that need to be 
calculated by a factor of and reduces the computa- 
tional time for the final diagonalization step by a factor 
of K^. 
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B. Choice of the number of basis states per 
fragment 

The selection of fragment eigenstates which will be in- 
cluded in the basis set is based on physical intuition. For 
the lowest part of the spectrum, one expects that taking 
just the few lowest states of the trimer (whose eigenstates 
are shown in Fig. |4]) should give quite accurate results. 
One can see from Fig. [S] that even a single wave function 
per trimer gives quite satisfactory results, with a system- 
atic error of the order of 30 meV only. This is not so 
surprising since the lowest state of the trimer is separated 
from the next one by about 3 eV (see Fig. HJ and there- 
fore it is the only state that strongly contributes to the 
wave functions in the lowest part of the spectrum. With 
the inclusion of more states, the results converge towards 
the results obtained in plane wave basis. This is fully ex- 
pected from the variational principle that states that the 
energy of any state formed from the finite basis set must 
be larger than the exact one and converges towards the 
exact one as the space spanned by the basis set is further 
increased. 
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FIG. 5. (Color online) The comparison of eigenenergies at 
the bottom of the valence band of amorphous alkane system 
calculated in the basis of trimer fragment wave functions -Efr 
and plane waves Epw- The number of the basis wave functions 
taken from each trimer is specified in the legend. 
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FIG. 4. The eigenenergies of the monomer (methane), dimer 
(ethane) and trimer (propane). All occupied states and one 
unoccupied state are shown. 



When one is interested in the part of the spectrum near 
the top of the valence band, one expects, based on physical 
intuition, that these states are formed from the highest oc- 
cupied states of the fragment. However, in this case there 
is no exact principle that requires the states to converge 
(either from the top or bottom) toward the exact values as 
more highest occupied states of the fragment are added to 
the basis set. Indeed, we see from Fig. [H] that the results 
obtained with one and five basis states per fragment are on 
the opposite side of the line with exact results. Further- 
more, the results obtained with two, three or four highest 
occupied states per fragment, appear to have large basis 
set superposition errors and yield a completely different 
spectrum, where it is not even possible to correlate the 
eigenstates with the exact ones (these results are there- 



fore not shown). The origin of such behavior comes from 
the energy level structure of the trimer fragment (Fig. 2]). 
There are quite a few states near the highest occupied 
state and until all of them are included in the basis set, 
it is not possible to get a good description of the energy 
level structure. 
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FIG. 6. (Color online) The comparison of eigenenergies of 
amorphous alkane system calculated in the basis of trimer frag- 
ment wave functions -Epr and plane waves Epw- The number 
of the highest occupied basis wave functions taken from each 
trimer is specified in the legend. 
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From these results, we raay speculate about the general 
rule for the choice of basis states when one is interested 
in the states at the top of the valence band. One should 
make a cutoff based on energies of fragment states at the 
place where there is a substantial gap in their energies. 
However, it is difficult to predict in advance how many 
HOMOs are necessary. While in the case of alkanes at 
least 5 HOMOs are required to get reasonably accurate 
results (Fig. ini), in the case of thiophenes a single HOMO 
yields quite good results, as shown in Sec. HV] Of course, 
the inclusion of all occupied states certainly leads to a 
good basis set. We find that it is often useful and practical 
to test the basis set convergence on some small systems 
(e.g., a single chain) where direct DFT calculation for the 
whole system is possible, before using the current method 
to calculate large systems. 

It is also of substantial interest to determine which of 
the calculated states will be occupied and which not and 
consequently identify the Fermi level of the system. In the 
case when all occupied states of the fragments are used as 
basis set, it is trivial to occupy the states of the whole 
system based on the number of electrons in the system. 
In the case when only a few HOMOs of the fragments are 
taken into account, there is no such obvious procedure. 
Nevertheless, in practice we find it easy to recognize the 
HOMO - LUMO gap using the following procedure. We 
calculate = {i\H\i), where H is the Hamiltonian of the 
whole system and \i) the HOMO of the fragment. We 
then find the first gap in the density of states above e^. 
That gap corresponds to the HOMO - LUMO gap. All 
states below that gap are then occupied, while the states 
above are empty. 



C. Choice of the distance cutoff 

An important factor that determines the accuracy of 
the calculation on the one hand and its speed on the other 
hand is the choice of pairs of fragments that are taken into 
account. We define the distance between fragments j and 
n as the minimal distance between an atom in fragment j 
and an atom in fragment n. A pair of fragments {j', n) is 
included in the calculation if the distance between them 
is smaller than some cutoff dcut- AH the results presented 
so far have been obtained with dcut (overcautiously) set to 
7A. It is of interest to find the optimal value of dcut which 
gives accurate eigenstates while minimizing the number 
of fragment pairs in the calculation. The dependence of 
the energies in the lowest part of the spectrum on c?cut is 
presented in Fig. [71 which shows that c?cut=5A gives fully 
converged eigenstates. 



D. Dependence of computational time on system 

size 

The computational time for the dominant part of the 
calculation in the described methodology scales linearly 
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FIG. 7. The dependence of the eigenenergies of amorphous 
alkane system in the lowest part of the valence spectrum on 
the cutoff distance dcut between the fragments. The basis set 
with one wave function per fragment is used. 



with the size of the system in the size range considered 
in this paper, if the states in the fixed energy window 
are required, for the following reasons. The number of 
fragments is proportional to the size of the system, while 
the number of basis wave functions per fragment remains 
the same. Therefore, the time necessary to calculate all 
the basis wave functions is proportional to the number of 
fragments and consequently scales linearly with the size of 
the system. Furthermore, the number of fragment pairs 
with distance less than a certain predefined dcut is also 
proportional to the number of atoms. For example, in 
the case of alkanes for dcut = SA, the number of pairs is 
approximately ten times larger than the number of atoms. 
As a consequence, the CPU time for these two parts of the 
calculation scales linearly with the size of the system, as 
shown in Figs. [S] and [HI The final diagonalization step, 
on the other hand, formally scales as with the size of 
the system. However, in the system size range that we 
consider this step takes a much smaller amount of time 
than the first two steps, as demonstrated in Fig. [HI (right 
panel). As a result, in this range of system dimensions, 
the total computational effort scales linearly with system 
size, as can be seen in Figs. [8]and|9l 



IV. APPLICATION TO A CONJUGATED 
POLYMER SYSTEM 

The main motivation behind the development of this 
methodology was the lack of appropriate methods for the 
efficient calculation of the electronic structure of disor- 
dered conjugated polymers, where large supercells are re- 
quired to provide insight into the physical properties of 
the system. Therefore, in this section, we test the appli- 
cability of the method to the calculation of hole states in 
P3HT, a widely studied polymer for applications in or- 
ganic electronics. 
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FIG. 8. The dependence of the CPU time (defined as the 
wall clock time times the number of CPUs) on the size of the 
amorphous alkane system. The line is a fit to the 0{N) depen- 
dence. The calculations have been performed using one basis 
wave function per fragment. The number of CPUs in these 
calculations is typically of the order of 5000. 
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FIG. 9. (Color online) The dependence of the CPU time (de- 
fined as the wall clock time times the number of CPUs) on 
the size of the amorphous alkane system. The basis of ten 
wavefunctions per fragment is used in these calculations. The 
left panel shows the dependence of the time required for the 
calculation of the basis wavefunctions and Hamiltonian matrix 
elements [the line is a fit to the 0{N) dependence]. The right 
panel shows the time required for the solution of the general- 
ized eigenvalue problem [the line is a fit to the 0{N^) depen- 
dence]. The number of CPUs used for the calculation in the 
left panel was typically of the order of 5000, while in the right 
panel it ranged from 100 for the smallest system to 1600 for 
the largest system. This yields wallclock time of the order of 
several hours for the construction of Hamiltonian matrix and 
less than 10 minutes for its diagonalization. 



We compare the eigenenergies obtained using OFM 
with the ones obtained by diagonahzing the CPM Hamil- 
tonian using the plane wave basis set with kinetic energy 
cutoff of 60 Ry (which is done using the ESC AN code). 
For this test, we consider the system of 5 P3HT chains, 
each one being 20 thiophene rings long (which makes 2510 
atoms altogether). The atomic structure of the system 
was generated from classical MD, using a simulated an- 
nealing procedure, as in our previous workiil We make 
a comparison for ten different random realizations of the 
system, differing by initial conditions in MD simulation. 
We choose the basis of overlapping trimer fragments. In 
the fragments the side hexyl chains have been replaced 
by propyl chains. This replacement is motivated by the 
well known fact that wave functions in the region near the 
band edge that determine the electronic properties are lo- 
calized on the main chain and not the alkyl side chains. 
Such a division into fragments would certainly not be suffi- 
cient to describe the part of electronic spectrum where the 
electronic states stemming from alkyl chains contribute to 
the density of states. That region is however far from the 
band gap region and is not of any physical interest. We 
further note that such a replacement by no means implies 
that the presence of hexyl side chains is ignored, in terms 
of their effects on the atomic structure of the system and 
the electrostatic potential in the full system Hamiltonian 
H as constructed using the CPM. 

We performed the test for different sizes of the basis 
set, consisting of n top HOMOs of each fragment (where 
n G {1,2,3,4}). The results gathered from all ten ran- 
dom realizations are presented in Fig. [TUl The results 
obtained with n = I are already quite accurate. The 
eigenenergy error for the states closest to the top of the 
valence band is of the order of 30 meV and increases to 
120 meV as one goes 0.7 eV further away. The results are 
the most accurate for n = 3 when the eigenenergy error 
is in the 10-50meV range. One should note that there is 
no exact principle that requires the eigenenergies to con- 
verge towards the "exact" ones as the basis set is increased 
and therefore there is no guarantee that a larger basis set 
would improve the results. Indeed, we find that for n = 4 
the results become worse than for n = 3 (which can be 
evidenced by a larger dispersion of points and the pres- 
ence of points both below and above the line). Finally, 
as one goes beyond n = 4 certain states enter the band 
gap region and it becomes impossible even to establish a 
correspondence between the eigenstates from plane wave 
calculation with eigenstates from OFM calculation. 

We would like to point out that our methodology 
strongly reduces the size of the basis set needed to rep- 
resent the Hamiltonian of the system and for that reason 
makes the diagonalization part of the calculation the least 
demanding one. In the case of 2510 atom P3HT system, 
the basis of top three HOMOs per fragment consists of 270 
elements and yields eigenenergies with errors in the 10-50 
meV range. On the other hand, if the same system were 
considered using some typical basis of Gaussian orbitals, 
such as 6-31G*, the size of the basis set would be 19720. In 
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the case of alkanes, the gain is somewhat smaller. In our 
method with 5 HOMOs per fragment, that yields eigenen- 
ergy errors below 50 meV, we use 1800 basis wavefunctions 
to represent the 1240 atom alkane system. On the other 
hand, the 6-31G* basis set for the same system consists 
of 7680 wavefunctions. 




7.6 7.8 8 



8.2 8.4 7.6 7.8 8 8.2 8.4 

Epw(eV) 



FIG. 10. (Color online) The comparison of eigenenergies of the 
amorphous P3HT system obtained using the diagonalization of 
the Hamiltonian by the OFM Efb. and in the plane wave basis 
Epw- The straight line is given as a guide to the eye. 



V. CONCLUSION AND OUTLOOK 

We have introduced the OFM for the calculation 
of eigenstates of the single-particle Hamiltonian. The 
method is based on the partitioning of the system into 
mutually overlapping fragments, the representation of the 



Hamiltonian in the basis of eigenstates of these fragments 
and the diagonalization of the obtained generalized eigen- 
value problem. We have illustrated the method by ap- 
plying it to find the eigenstates of the Hamiltonian of 
organic polymers obtained from the CPM. The method 
is expected to be more general - it would be very inter- 
esting to test the method in other systems, such as for 
example inorganic nanostructures, inorganic alloys or any 
other organic structures - either ordered or disordered. 
The method is expected to be especially useful for under- 
standing the properties of electronic states in the near- 
band gap tail of the density of states of statically or dy- 
namically disordered systems, where large statistics is nec- 
essary to get reliable information. In this kind of systems 
the method would provide detailed information about the 
density of states in the tail, the wave function localiza- 
tion properties and consequently the electronic transport 
in the system. Furthermore, the method directly yields 
a parametrization of the Hamiltonian in a localized basis 
set and as such can be used as a starting point to build 
simple, but insightful tight-binding models of disordered 
systems. Finally, the method is naturally parallelizable 
and can make excellent use of parallel computing archi- 
tectures, which have become the dominant paradigm in 
modern computing. 
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